Topics in semantic association

Authors

  • Thomas L. Griffiths
  • Mark Steyvers
  • Joshua B. Tenenbaum
Abstract

Learning and using language requires retrieving concepts from memory in response to an ongoing stream of information. The human memory system solves this problem by using the gist of a sentence, conversation, or document to predict related concepts and disambiguate words. Two approaches to representing gist have dominated research on semantic representation: semantic networks and semantic spaces. We take a step back from these approaches, and analyze the abstract computational problem underlying the extraction and use of gist, formulating this problem in statistical terms. This analysis allows us to explore a novel approach to semantic representation, in which words are represented using a set of probabilistic topics. The topic model performs well in predicting word association, free recall, and the senses of words, and provides a foundation for developing richer statistical models of language.

Learning, speaking, and understanding language all require solving a challenging computational problem: retrieving a variety of concepts from memory in response to an ongoing stream of information. The human memory system solves this problem by using the semantic context – the gist of a sentence, conversation, or document – to predict related concepts and disambiguate words. Online processing of sentences can be facilitated by predicting which concepts are likely to be relevant before they are needed. For example, if the word BANK appears in a sentence, it might become more likely that words like FEDERAL and RESERVE would also appear in that sentence, and this information could be used to initiate retrieval of the information related to these words. This pre...

This work was supported by a grant from the NTT Communication Sciences Laboratory. While completing this work, TLG was supported by a Stanford Graduate Fellowship, and JBT by the Paul E. Newton chair. We thank Touchstone Applied Sciences, Tom Landauer, and Darrell Laham for making the TASA corpus available, and for their thoughts on these topics.
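
The BANK example can be made concrete with a small sketch. It assumes that each topic is a probability distribution over words and that prediction works by first inferring which topic a cue word came from, then averaging the topics' word distributions under that inference. The two topics, the five-word vocabulary, the uniform prior, and all probabilities are invented for illustration; none of them come from the paper or the TASA corpus, and the function names are hypothetical.

```python
# Toy illustration only: topics, vocabulary, and probabilities are invented
# for this sketch and are not taken from the paper or the TASA corpus.

VOCAB = ["BANK", "FEDERAL", "RESERVE", "RIVER", "STREAM"]

# P(word | topic) for two hypothetical topics; each row sums to 1.
P_WORD_GIVEN_TOPIC = {
    "finance": {"BANK": 0.4, "FEDERAL": 0.3, "RESERVE": 0.3,
                "RIVER": 0.0, "STREAM": 0.0},
    "countryside": {"BANK": 0.2, "FEDERAL": 0.0, "RESERVE": 0.0,
                    "RIVER": 0.4, "STREAM": 0.4},
}

# Prior P(topic), assumed uniform for this sketch.
P_TOPIC = {"finance": 0.5, "countryside": 0.5}


def topic_posterior(cue):
    """Return P(topic | cue) using Bayes' rule."""
    joint = {z: P_TOPIC[z] * P_WORD_GIVEN_TOPIC[z][cue] for z in P_TOPIC}
    total = sum(joint.values())
    return {z: p / total for z, p in joint.items()}


def predict_next(cue):
    """Return P(word | cue) = sum over topics of P(word | topic) * P(topic | cue)."""
    posterior = topic_posterior(cue)
    return {
        w: sum(P_WORD_GIVEN_TOPIC[z][w] * posterior[z] for z in posterior)
        for w in VOCAB
    }


if __name__ == "__main__":
    for word, prob in sorted(predict_next("BANK").items(), key=lambda kv: -kv[1]):
        print(f"{word:8s} {prob:.3f}")
```

With these made-up numbers, seeing BANK shifts belief toward the finance topic, so FEDERAL and RESERVE become more probable than RIVER and STREAM. An unambiguous cue such as RIVER places all of the posterior on the other topic, which is one way to think about how gist could disambiguate a word like BANK.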

Related resources

The Impact of Semantic Clustering on Iranian EFL Advanced Learners’ Vocabulary Retention

This study investigated the impact of semantic clustering on Iranian EFL learners’ vocabulary retention at the advanced level. Participants were female learners randomly assigned to two groups of 15. Four instruments (a TOEFL test, a vocabulary pretest, an immediate posttest, and a delayed recall posttest) were used. The experimental group underwent semantic clustering vocabulary presentation in which the l...

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Evidence from brain trauma suggests that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users alike. Constructing an adequate query that best specifies the user’s information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...

Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...

Detecting hot topics from Twitter: A multiview approach

Twitter is widely used all over the world, and a huge number of hot topics are generated by Twitter users in real time. These topics are able to reflect almost every aspect of people’s daily lives. Therefore, the detection of topics in Twitter can be used in many real applications, such as monitoring public opinion, hot product recommendation and incidence detection. However, the performance of...

Publication date: 2005